A Review of Glottal Waveform Analysis

Authors

  • Jacqueline Walker
  • Peter J. Murphy
Abstract

Glottal inverse filtering is of potential use in a wide range of speech processing applications. Because voice production is, to a first-order approximation, a source-filter process, separating the source and filter components provides a flexible representation of the speech signal for use in processing applications. In certain applications the need for accurate inverse filtering is immediately obvious: to assess laryngeal aspects of voice quality, or to correlate acoustics with vocal fold dynamics, the resonances of the vocal tract must first be removed. Similarly, for assessment of vocal performance, trained singers may wish to obtain quantitative data or feedback regarding their voice at the level of the larynx. Accurate glottal inverse filtering remains important even in applications where the extracted glottal signal is not of primary interest in itself. A number of speech processing applications require a flexible representation of the speech signal, e.g., harmonics plus noise modelling (HNM) [74] or sinusoidal modelling [65], to allow efficient modification of the signal for speech enhancement, voice conversion or speech synthesis. In connected speech it is the glottal source (including the fundamental frequency) that changes under a time-varying vocal tract, and hence an optimum representation should track both glottal and filter changes. Another potential application of glottal inverse filtering is speech coding, either in a representation similar to HNM (but incorporating a glottal source) or, as in [3], [11], [19], through coding strategies termed glottal excited linear prediction (GELP), which use glottal flow waveforms in place of the residual or random waveforms used in existing code excited linear prediction (CELP) codecs. In the studies cited, the perceptual quality of the GELP codecs is similar to that of CELP.
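As a rough illustration of the source-filter idea described above, the following Python sketch passes an excitation through an all-pole "vocal tract" filter, estimates that filter by linear prediction (autocorrelation method), and then inverse filters the result to recover the excitation. It is a minimal sketch and not the method of the paper: the function names, the second-order filter coefficients, and the impulse-train excitation (a crude stand-in for a glottal source) are all assumptions chosen for the example.

```python
import numpy as np

def lpc_autocorr(x, order):
    """Estimate all-pole coefficients a[0..order] (a[0] = 1), autocorrelation method."""
    x = np.asarray(x, dtype=float)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def all_pole_filter(a, excitation):
    """Synthesis (filter stage): y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

def inverse_filter(a, speech):
    """Analysis (inverse filtering): e[n] = sum_k a[k] * s[n-k], i.e. FIR filter A(z)."""
    return np.convolve(speech, a)[: len(speech)]

# Hypothetical example values (assumptions, not taken from the paper).
true_a = np.array([1.0, -1.3, 0.8])   # stable second-order resonance
excitation = np.zeros(800)
excitation[::80] = 1.0                # impulse train, period 80 samples

speech = all_pole_filter(true_a, excitation)   # source -> filter
est_a = lpc_autocorr(speech, order=2)          # estimate the filter
residual = inverse_filter(est_a, speech)       # remove the filter
```

Here `residual` approximates `excitation` only because the synthetic signal matches the all-pole assumption exactly. Real speech would need further steps that this sketch leaves out.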
The speaker identification characteristics of glottal parameters have also recently undergone preliminary investigation [64] (note this is quite different from investigating the speaker identification characteristics of the linear prediction residual signal). Identification accuracy up to approximately 70% is reported using glottal parameters alone. Future studies employing explicit combinations of glottal and filter components may provide much higher identification rates. In addition, the more that is understood regarding glottal changes in connected speech

Related resources

Parameterisation Methods of the Glottal Flow Estimated by Inverse Filtering

Estimation of the source of voiced speech, the glottal volume velocity waveform, with inverse filtering usually involves a parameterisation stage, in which the obtained flow waveforms are expressed in numerical form. This stage of voice source analysis, the parameterisation of the glottal flow, is discussed in the present paper. The paper aims to give a review on the different methods develop...

Improving Glottal Waveform Rank-based Glottal Qua

Information on the glottal waveform is an important part of many speech applications. However, glottal waveform estimation remains one of the more inexact sciences of speech processing. The work presented here describes an enhancement to a recently presented algorithm by a new technique involving Rank-Based Glottal Quality Assessment (RB-GQA). The basic premise is to investigate potential measu...

Depression Detection & Emotion Classification via Data-Driven Glottal Waveforms

This doctoral consortium paper outlines the author’s proposed investigation into the use of the voice-source waveform for affective computing. A data-driven glottal waveform representation, previously examined in the author’s earlier doctoral studies for its speaker discriminative abilities, is proposed to be studied for both depression detection and emotion recognition, including severity class...

Glottal closure and opening detection for flexible parametric voice coding

Knowledge of glottal closure and opening instants (GCI/GOI) is useful for many speech analysis applications. Pitch-synchronous waveform encoding of voice is one such application. In this paper, dynamic programming is employed to solve for the global close/open phase segmentation based on the polynomial parametric waveform of the derivative glottal waveform and its quasi-periodicity. Not ...

Voice source analysis using biomechanical modeling and glottal inverse filtering

This paper studies the use of glottal inverse filtering together with a biomechanical model of the vocal folds to simulate the glottal flow waveform. The glottal flow waveform is first estimated by inverse filtering the acoustic speech pressure signal of natural speech. The estimated glottal flow is used as a template in an optimization process which searches for a set of parameters for a deter...

Glottal Waveforms for Speaker Inference & A Regression Score Post-Processing Method Applicable to General Classification Problems

Contributions are made along two main lines. Firstly a method is proposed for using a regression model to learn relationships within the scores of a machine learning classifier, which can then be applied to future classifier output for the purpose of improving recognition accuracy. The method is termed r-norm and strong empirical results are obtained from its application to several text-indepen...

Publication date: 2005